RUSSE'2018: A Shared Task on Word Sense Induction for the Russian Language

Authors

  • Alexander Panchenko
  • Anastasiya Lopukhina
  • Dmitry Ustalov
  • Konstantin Lopukhin
  • Nikolay Arefyev
  • Alexey Leontyev
  • Natalia Loukachevitch
Abstract

The paper describes the results of the first shared task on word sense induction (WSI) for the Russian language. While similar shared tasks were conducted in the past for some Romance and Germanic languages, we explore the performance of sense induction and disambiguation methods for a Slavic language that shares many features with other Slavic languages, such as rich morphology and free word order. The participants were asked to group contexts of a given word according to its senses, which were not provided beforehand. For instance, given the word “bank” and a set of contexts containing it, e.g. “bank is a financial institution that accepts deposits” and “river bank is a slope beside a body of water”, a participant was asked to cluster these contexts into a number of clusters not known in advance, corresponding in this case to the “company” and the “area” senses of the word “bank”. For the purpose of this evaluation campaign, we developed three new evaluation datasets based on sense inventories of different sense granularity. The contexts in these datasets were sampled from Wikipedia texts, the academic corpus of Russian, and an explanatory dictionary of Russian. Overall, 18 teams participated in the competition, submitting 383 models. Multiple teams managed to substantially outperform competitive state-of-the-art baselines from previous years based on sense embeddings.
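The task format described in the abstract can be illustrated with a minimal sketch. It assumes a toy bag-of-words representation and a greedy, threshold-based clustering, so the number of clusters is not fixed in advance. This is not the method of any participant or baseline: the function names, the stopword list, and the threshold value are hypothetical and chosen only for illustration, and the English contexts stand in for the Russian data used in the task.

```python
import re

# Hypothetical, tiny stopword list used only for this illustration.
STOPWORDS = {"a", "an", "the", "is", "of", "that", "and", "in", "to", "with"}


def tokenize(context, target):
    """Return the set of content words in a context, excluding the target word."""
    words = re.findall(r"[a-z]+", context.lower())
    return {w for w in words if w not in STOPWORDS and w != target}


def jaccard(a, b):
    """Jaccard similarity between two sets of words (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def induce_senses(contexts, target, threshold=0.1):
    """Greedily assign each context to the most similar existing cluster,
    or open a new cluster if no similarity reaches the threshold.
    Returns one cluster label per context."""
    cluster_vocabs = []  # one set of words per induced cluster
    labels = []
    for context in contexts:
        tokens = tokenize(context, target)
        best, best_sim = None, threshold
        for cid, vocab in enumerate(cluster_vocabs):
            sim = jaccard(tokens, vocab)
            if sim >= best_sim:
                best, best_sim = cid, sim
        if best is None:
            # No cluster is similar enough: start a new sense cluster.
            cluster_vocabs.append(set(tokens))
            best = len(cluster_vocabs) - 1
        else:
            # Grow the vocabulary of the chosen cluster.
            cluster_vocabs[best] |= tokens
        labels.append(best)
    return labels


contexts = [
    "bank is a financial institution that accepts deposits",
    "river bank is a slope beside a body of water",
    "the bank accepts deposits and issues loans",
    "the river flooded the bank and the water rose",
]
print(induce_senses(contexts, "bank"))  # [0, 1, 0, 1]
```

The first two contexts are the ones quoted in the abstract; the other two are invented here to show contexts joining existing clusters. A real system would replace the word-overlap similarity with a stronger representation, such as the sense embeddings mentioned for the baselines.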


Similar resources

RUSSE: The First Workshop on Russian Semantic Similarity

The paper gives an overview of the Russian Semantic Similarity Evaluation (RUSSE) shared task held in conjunction with the Dialogue 2015 conference. There exist a lot of comparative studies on semantic similarity, yet no analysis of such measures was ever performed for the Russian language. Exploring this problem for the Russian language is even more interesting, because this language has featu...


Multilingual Word Sense Discrimination: A Comparative Cross-Linguistic Study

We describe a study that evaluates an approach to Word Sense Discrimination on three languages with different linguistic structures, English, Hebrew, and Russian. The goal of the study is to determine whether there are significant performance differences for the languages and to identify language-specific problems. The algorithm is tested on semantically ambiguous words using data from Wikipedi...


Hatred as a Moral and Ethical Conception in Russian Society

The present paper deals with the national specifics of the assessment aspect in the meaning of words. A modern scientific paradigm considers language as a cognitive tool for understanding the world and for keeping and representing people’s experience and values, which reflect the people’s vision of the world (“the world picture”). Usually linguistics understands the language ...


Word sense induction using word embeddings and community detection in complex networks

Word Sense Induction (WSI) is the ability to automatically induce word senses from corpora. The WSI task was first proposed to overcome the limitations of the manually annotated corpora that are required in word sense disambiguation systems. Even though several works have been proposed to induce word senses, existing systems are still very limited in the sense that they make use of structured, domai...


unimelb: Topic Modelling-based Word Sense Induction

This paper describes our system for shared task 13 “Word Sense Induction for Graded and Non-Graded Senses” of SemEval-2013. The task is on word sense induction (WSI), and builds on earlier SemEval WSI tasks in exploring the possibility of multiple senses being compatible to varying degrees with a single contextual instance: participants are asked to grade senses rather than selecting a single s...




Publication date: 2018